ABSTRACT OF THE DISCLOSURE 

Systems and methods create high-quality audio-centric, image-centric, and integrated 
audio-visual summaries by seamlessly integrating image, audio, and text features extracted 
from input video. Integrated summarization may be employed when strict synchronization of 
audio and image content is not required. Video programming which requires synchronization 
of the audio content and the image content may be summarized using either an audio-centric 
or an image-centric approach. Both a machine learning-based approach and an alternative, 
heuristics-based approach are disclosed. Numerous probabilistic methods may be employed 
with the machine learning-based approach, such as naive Bayes, decision trees, neural 
networks, and maximum entropy. To create an integrated audio-visual summary using the 
alternative, heuristics-based approach, a maximum-bipartite-matching approach is disclosed 
by way of example. 
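
As an illustration only, maximum bipartite matching between selected audio segments and candidate image segments might be sketched as follows. The sketch uses a standard augmenting-path search; the function name, the edge representation, and the idea that an edge marks a permissible audio/image pairing are assumptions made for this example and are not taken from the disclosure.

```python
def max_bipartite_matching(edges, n_audio, n_image):
    """Return (size, pairs) for a maximum matching.

    edges[a] lists the image-segment indices that audio segment `a`
    may be paired with (hypothetical representation).
    """
    # match_of_image[i] holds the audio segment currently paired with
    # image segment i, or -1 if image segment i is unpaired.
    match_of_image = [-1] * n_image

    def try_assign(a, visited):
        # Search for an augmenting path starting at audio segment a.
        for i in edges.get(a, []):
            if i in visited:
                continue
            visited.add(i)
            # Take i if it is free, or if its current partner can move.
            if match_of_image[i] == -1 or try_assign(match_of_image[i], visited):
                match_of_image[i] = a
                return True
        return False

    size = 0
    for a in range(n_audio):
        if try_assign(a, set()):
            size += 1

    pairs = {a: i for i, a in enumerate(match_of_image) if a != -1}
    return size, pairs


# Hypothetical example: three audio segments, three image segments.
example_edges = {0: [0, 1], 1: [0], 2: [1, 2]}
size, pairs = max_bipartite_matching(example_edges, 3, 3)
```

In this small example every audio segment can be paired with a distinct image segment, so the matching covers all three. How edges are chosen (for instance, by time constraints between segments) is left open here.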